A Hybrid HMM/TRAPS Model for Robust Voice Activity Detection

Authors

  • Brian Kingsbury
  • Pratibha Jain
Abstract

We present three voice activity detection (VAD) algorithms that are suitable for the off-line processing of noisy speech and compare their performance on SPINE-2 evaluation data using speech recognition error rate as the quality metric. One VAD system is a simple HMM-based segmenter that uses normalized log-energy and a degree of voicing measure as raw features. The other two VAD systems focus on frequency-localized temporal information in the speech signal using a TempoRAl PatternS (TRAPS) classifier. They differ only in the processing of the TRAPS output. One VAD system uses median filtering to generate segment hypotheses, while the other is a hybrid system that uses a Viterbi search identical to that used in the HMM segmenter. Recognition on the hybrid HMM/TRAPS segmentation is more accurate than recognition on the other two segmentations by 1% absolute. This difference is statistically significant at a 99% confidence level according to a matched pairs sentence-segment word error test.
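The two ways of post-processing a per-frame classifier output mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the classifier emits one speech posterior per frame, and the filter width, decision threshold, two-state topology, and self-transition probability (`width`, `threshold`, `stay`) are hypothetical values chosen only for the example.

```python
import math
from statistics import median


def median_smooth(posteriors, width=5, threshold=0.5):
    """Label frames speech (1) or non-speech (0) after median filtering.

    width and threshold are illustrative assumptions, not values from the paper.
    """
    half = width // 2
    labels = []
    for i in range(len(posteriors)):
        # Window is truncated at the edges of the utterance.
        window = posteriors[max(0, i - half): i + half + 1]
        labels.append(1 if median(window) > threshold else 0)
    return labels


def viterbi_smooth(posteriors, stay=0.9, eps=1e-6):
    """Two-state Viterbi decode (0 = non-speech, 1 = speech).

    Frame posteriors are used directly as emission scores; the two-state
    topology and the stay probability are assumptions for this sketch.
    """
    if not posteriors:
        return []
    log_trans = [[math.log(stay), math.log(1 - stay)],
                 [math.log(1 - stay), math.log(stay)]]

    def emit(p):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        return [math.log(1 - p), math.log(p)]

    score = emit(posteriors[0])
    backptrs = []
    for p in posteriors[1:]:
        e = emit(p)
        new_score, ptr = [], []
        for s in (0, 1):
            best_prev = max((0, 1), key=lambda r: score[r] + log_trans[r][s])
            ptr.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s] + e[s])
        score = new_score
        backptrs.append(ptr)

    # Trace back from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    return path[::-1]


def to_segments(labels):
    """Collapse frame labels into (start, end_exclusive) speech segments."""
    segments, start = [], None
    for i, lab in enumerate(labels):
        if lab and start is None:
            start = i
        elif not lab and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(labels)))
    return segments
```

Both functions suppress isolated frame-level errors, but in different ways: the median filter looks only at a fixed local window, while the Viterbi search penalizes every state change along the whole utterance, so segment boundaries are chosen jointly.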

Similar resources

Convolutional Neural Network with Adaptive Windows for Speech Recognition

Although speech recognition systems are widely used and their accuracy continues to improve, a considerable gap remains between their performance and human recognition ability. This is partly due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...

Comparison of Smoothing Techniques for Robust Context Dependent Acoustic Modelling in Hybrid NN/HMM Systems

Hybrid Neural Network/Hidden Markov Model (NN/HMM) systems have been found to yield high quality phone recognition performance. One issue with modelling the Context Dependent (CD) NN/HMM is the robust estimation of the NN parameters to reliably predict the large number of CD state posteriors. Previously, factorization based on conditional probabilities has been commonly adopted to circumvent th...

Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model

To facilitate the entry of data into the computer and its digitization, automatic recognition of printed texts and manuscripts is a considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...

Man-Machine Interaction System for Subject Independent Sign Language Recognition Using Fuzzy Hidden Markov Model

Sign language recognition (SLR) has attracted growing interest in the human-computer interaction community. The major challenge that SLR faces now is developing methods that scale well with increasing vocabulary size given a limited set of training data for signer-independent applications. Automatic SLR based on hidden Markov models (HMMs) is very sensitive to gesture's shape inf...

Publication date: 2002